IN-LINE INTERRUPT HANDLING AND LOCK-UP FREE TLBs

Authors

Aamer Jaleel

Bruce L. Jacob

Donald Yeung

Abstract

Title of Thesis: In-line Interrupt Handling and Lockup Free TLBs

Degree Candidate: Aamer Jaleel

Degree and Year: Master of Science, 2002

Thesis directed by: Dr. Bruce L. Jacob, Department of Electrical and Computer Engineering

The effects of the general-purpose precise interrupt mechanisms in use for the past few decades have received very little attention. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline to make the CPU available to execute handler instructions. In doing so, the CPU ends up flushing many instructions that have been brought into the reorder buffer. In particular, these instructions may have reached a very deep stage in the pipeline, representing significant work that is wasted. In addition, an overhead of several cycles and a waste of energy (per exception detected) can be expected in re-fetching and re-executing the flushed instructions.

This thesis concentrates on improving the performance of precisely handling software-managed translation lookaside buffer (TLB) interrupts, one of the most frequently occurring interrupts. The thesis presents a novel method of in-lining the interrupt handler within the reorder buffer. Since first-level TLB interrupt handlers are usually small, they could potentially fit in the reorder buffer along with the user-level code already there. In that case, the instructions that would otherwise be flushed from the pipe need not be re-fetched and re-executed. In-lining also allows instructions independent of the exceptional instruction to continue executing in parallel with the handler code. In-lining the TLB interrupt handler in this way provides lock-up free TLBs. This thesis proposes the prepend and append schemes of in-lining the interrupt handler into the available reorder buffer space. The two schemes are implemented on a processor with a 4-way out-of-order core similar to the Alpha 21264.
We compare the overhead and performance impact of handling TLB interrupts by the traditional scheme, the append in-lined scheme, and the prepend in-lined scheme. For small, medium, and large memory footprints, the overhead is quantified by comparing the number and pipeline state of instructions flushed, the energy savings, and the performance improvements. We find that lock-up free TLBs reduce the overhead of re-fetching and re-executing instructions by 30-95%, reduce energy consumption and execution time by 5-25%, and reduce the energy wasted by 30-90%.
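The core decision described in the abstract, in-lining the handler when it fits in the free reorder buffer space and flushing otherwise, can be illustrated with a toy model. This is only a sketch under stated assumptions, not the thesis's implementation: the names `ReorderBuffer`, `service_tlb_miss`, and `in_flight_from_fault`, and all the numbers used, are hypothetical, and the model ignores the differences between the prepend and append schemes.

```python
from dataclasses import dataclass


@dataclass
class ReorderBuffer:
    """Toy reorder buffer: only entry counts are tracked (hypothetical model)."""
    capacity: int  # total number of entries
    occupied: int  # entries held by in-flight user-level instructions

    def free_entries(self) -> int:
        return self.capacity - self.occupied


def service_tlb_miss(rob: ReorderBuffer, handler_len: int,
                     in_flight_from_fault: int) -> dict:
    """Choose between in-lining the handler and a traditional flush.

    handler_len: instruction count of the first-level TLB miss handler
        (assumed known in advance).
    in_flight_from_fault: entries occupied by the faulting instruction and
        everything younger, i.e. what a traditional flush would discard.
    """
    if handler_len <= rob.free_entries():
        # Handler fits next to the user code: nothing is flushed.
        return {"mode": "inline", "flushed": 0}
    # Not enough room: fall back to flushing and re-fetching later.
    return {"mode": "flush", "flushed": in_flight_from_fault}
```

With a hypothetical 80-entry buffer and a 10-instruction handler, a buffer holding 60 instructions has room to in-line, while one holding 75 does not and falls back to the flush, discarding the faulting instruction and everything after it.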

Similar resources

On the Design and Implementation of an Efficient Lock-Free Scheduler

Schedulers for symmetric multiprocessing (SMP) machines use sophisticated algorithms to schedule processes onto the available processor cores. Hardware-dependent code and the use of locks to protect shared data structures from simultaneous access lead to poor portability, the difficulty to prove correctness, and a myriad of problems associated with locking such as limiting the available paralle...

Improving the Precise Interrupt Mechanism of Software-Managed TLB Miss Handlers

The effects of the general-purpose precise interrupt mechanisms in use for the past few decades have received very little attention. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline to make the CPU available to execute handler instructions. In doing so, the CPU ends up flushing many instructions that have been brought in to the reord...

Performance results from SALMON, a cluster of Workstations Connected by SCI

SCI (Scalable Coherent Interface, IEEE Standard no.1596) defines a standard for high speed interconnection. We present simple throughput and latency results using different implementation strategies and configurations of SparcStations interconnected using Sbus/SCI adapters. These results are compared with similar results obtained from running TCP/IP over Ethernet and ATM. The Sbus/SCI interface...

Performance and the Single/multi-processor Operating System Process Subsystem

Operating systems depend on process subsystem performance, which we analyze in Choices. We use compile time specialization to optimize both single and multiprocessor performance within a single design. We partition critical sections into two types, and compose independent control mechanisms to produce high performance, specialized locks. We select heavily used process operations and justify new...

Implementation and Performance Evaluation of M-VIA on AceNIC Gigabit Ethernet Card

This paper describes the implementation and performance of M-VIA on the AceNIC Gigabit Ethernet card. The AceNIC adapter has several notable hardware features for high-speed communication, such as jumbo frames and interrupt coalescing. The M-VIA performance characteristics were measured and evaluated based on these hardware features. Our results show that latency and bandwidth improvement can b...

Publication date: 2002
